186 research outputs found

    Heterogeneous Entity Matching with Complex Attribute Associations using BERT and Neural Networks

    Across various domains, data from different sources such as Baidu Baike and Wikipedia often take distinct forms. Current entity matching methods focus predominantly on homogeneous data, in which attributes share the same structure and attribute values are concise. This focus makes it hard to handle data in diverse formats. Moreover, prevailing approaches aggregate the similarity of values between corresponding attributes to determine entity similarity, yet they often overlook the intricate interrelationships between attributes, where one attribute may have multiple associations. Simple pairwise attribute comparison fails to use the wealth of information contained in entities. To address these challenges, we introduce a novel entity matching model built upon pre-trained models, the Entity Matching Model for Capturing Complex Attribute Relationships (EMM-CCAR). Specifically, the model transforms the matching task into a sequence matching problem to mitigate the impact of varying data formats. By introducing attention mechanisms, it also identifies complex relationships between attributes, emphasizing the degree of matching among multiple attributes rather than one-to-one correspondences. With EMM-CCAR we overcome the challenges posed by data heterogeneity and intricate attribute interdependencies. Compared with the prevalent DER-SSM and Ditto approaches, our model improves F1 scores by approximately 4% and 1%, respectively, providing a robust solution for handling attribute complexity in entity matching.
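The step of turning entity matching into sequence matching can be illustrated with a minimal sketch. The abstract does not say how EMM-CCAR serializes records, so the `[COL]`/`[VAL]` markers (borrowed from Ditto-style serialization) and the helper names `serialize` and `to_sequence_pair` are assumptions made for illustration only. The sketch shows how two records with different attribute sets could be flattened into one text sequence, over which a pre-trained model's attention could then relate any attribute to any other.

```python
def serialize(entity: dict) -> str:
    """Flatten one record into a token sequence.

    The [COL]/[VAL] markers are an assumed, Ditto-style convention,
    not something the abstract confirms for EMM-CCAR.
    """
    parts = []
    for attr, value in entity.items():
        parts.append(f"[COL] {attr} [VAL] {value}")
    return " ".join(parts)


def to_sequence_pair(left: dict, right: dict) -> str:
    """Join two serialized records into a BERT-style sentence pair."""
    return f"[CLS] {serialize(left)} [SEP] {serialize(right)} [SEP]"


# Two hypothetical records whose schemas do not line up one-to-one.
baike = {"name": "Ada Lovelace", "summary": "English mathematician, born 1815"}
wiki = {"title": "Ada Lovelace", "born": 1815, "occupation": "mathematician"}

pair = to_sequence_pair(baike, wiki)
```

Because both records end up in a single sequence, nothing in this representation requires the attributes to be aligned in advance; that alignment is left to the model.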

    The Washback Effect of Reformed CET 6 Listening Comprehension Test

    In China, the College English Test Band 6 (CET-6) is a national test used by the state to assess test-takers' English proficiency, with unified questions, unified fees, and unified test organization. It is held twice a year and has had a great impact on college students and college teachers. It was introduced in 1978. In 2016, the Ministry of Education reformed CET-6, especially the listening test. The reformed listening test not only brings scenes and dialogues closer to daily life but also emphasizes the examination of students' comprehensive English listening and speaking ability. Drawing on theoretical models and empirical results of washback research at home and abroad, this paper studies, from the learners' perspective, the washback of the reformed CET-6 listening test on learners' listening study by means of a questionnaire. The survey used quantitative research methods with a sample of 60 drawn from several public universities. After collecting and analyzing the data, the authors conclude that the test has a significant washback effect on student learning.

    Effects of Biofuel Policies on World Food Insecurity -- A CGE Analysis

    The food vs. fuel debate has heated up since the 2008 global food crisis, when major crop prices increased dramatically. Heavily subsidized biofuel production was blamed for diverting crops and other resources from food and feed production, triggering a global food crisis and increasing the world's food insecure population. Few studies have quantified the effects of biofuel policies on world food prices and world food insecurity. This study adds Brazil's and China's biofuel sectors to an existing global trade CGE model and applies the measure of food insecurity developed by the FAO. Alternative scenarios were food insecurity. Results are examined with a focus on (1) effects on domestic biofuel production, (2) changes in food commodity production and trade, (3) changes in land use and land rents, and (4) changes in regional undernourished populations. Results indicate that biofuel expansion is not cost competitive with traditional fossil fuels. Without any policy incentives, a large expansion of biofuel production is not likely under current technology. The conventional biofuel mandates in the U.S., Brazil and China lead to increases in world food insecurity, while the advanced biofuel mandate in the U.S. has the opposite effect. Subsidies to biofuel production help to lessen the increase in world food insecurity caused by increases in conventional biofuel production. Additionally, the effects of U.S. biofuel policies are smaller but more widespread than the effects of Brazil's or China's biofuel policies. Overall, the long-term effects of biofuel production expansion on world food insecurity are much smaller than expected.

    DeepOrgan: Multi-level Deep Convolutional Networks for Automated Pancreas Segmentation

    Automatic organ segmentation is an important yet challenging problem for medical image analysis. The pancreas is an abdominal organ with very high anatomical variability. This inhibits previous segmentation methods from achieving high accuracies, especially compared to other organs such as the liver, heart or kidneys. In this paper, we present a probabilistic bottom-up approach for pancreas segmentation in abdominal computed tomography (CT) scans, using multi-level deep convolutional networks (ConvNets). We propose and evaluate several variations of deep ConvNets in the context of hierarchical, coarse-to-fine classification on image patches and regions, i.e. superpixels. We first present a dense labeling of local image patches via $P{-}\mathrm{ConvNet}$ and nearest neighbor fusion. Then we describe a regional ConvNet ($R_1{-}\mathrm{ConvNet}$) that samples a set of bounding boxes around each image superpixel at different scales of contexts in a "zoom-out" fashion. Our ConvNets learn to assign class probabilities for each superpixel region of being pancreas. Last, we study a stacked $R_2{-}\mathrm{ConvNet}$ leveraging the joint space of CT intensities and the $P{-}\mathrm{ConvNet}$ dense probability maps. Both 3D Gaussian smoothing and 2D conditional random fields are exploited as structured predictions for post-processing. We evaluate on CT images of 82 patients in 4-fold cross-validation. We achieve a Dice Similarity Coefficient of $83.6 \pm 6.3\%$ in training and $71.8 \pm 10.7\%$ in testing. Comment: To be presented at MICCAI 2015 - 18th International Conference on Medical Computing and Computer Assisted Interventions, Munich, Germany
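For readers unfamiliar with the reported metric, the Dice Similarity Coefficient between a predicted mask A and a reference mask B is 2|A ∩ B| / (|A| + |B|). The sketch below computes it for flattened binary masks in plain Python. The function name `dice` and the convention of returning 1.0 when both masks are empty are assumptions for illustration, not details taken from the paper.

```python
def dice(pred, truth):
    """Dice Similarity Coefficient for two flattened binary masks.

    pred and truth are equal-length sequences of 0/1 (or False/True),
    one entry per voxel. Returns a value in [0, 1].
    """
    if len(pred) != len(truth):
        raise ValueError("masks must have the same number of voxels")
    pred = [bool(p) for p in pred]
    truth = [bool(t) for t in truth]
    # Voxels labelled as organ in both masks.
    intersection = sum(p and t for p, t in zip(pred, truth))
    total = sum(pred) + sum(truth)
    if total == 0:
        # Assumed convention: two empty masks agree perfectly.
        return 1.0
    return 2 * intersection / total


# Prediction marks 2 voxels, reference marks 1, and they share 1:
# Dice = 2 * 1 / (2 + 1) = 2/3.
score = dice([1, 1, 0, 0], [1, 0, 0, 0])
```

A score of 1 means the two masks coincide exactly and 0 means they do not overlap at all.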

    An Efficient Built-in Temporal Support in MVCC-based Graph Databases

    Real-world graphs are often dynamic and evolve over time. To trace the evolving properties of graphs, graph databases with temporal support must maintain every change to both vertices and edges. Existing works either maintain all changes in a single graph or periodically materialize snapshots to maintain the historical states of each vertex and edge, and then process queries over the proper snapshots. The former approach suffers from poor query performance because the graph keeps growing over time, while the latter suffers from prohibitively high storage overheads due to large redundant copies of graph data across snapshots. In this paper, we propose a hybrid data storage engine, based on the MVCC mechanism, that manages current and historical data separately and so keeps the current graph as small as possible. In our design, changes to each vertex or edge are stored once. To further reduce the storage overhead, we store only the changes rather than complete snapshots. To boost query performance, we place a few anchors as snapshots to avoid deep historical version traversals. On top of the storage engine, a temporal query engine reconstructs subgraphs on the fly as needed. Our approach can therefore provide fast queries over subgraphs at a past time point or range with small storage overheads. To provide native support for temporal features, we integrate our approach into Memgraph and call the extended database system TGDB (Temporal Graph Database). Extensive experiments are conducted on four real and synthetic datasets. The results show that TGDB performs better than state-of-the-art methods in terms of both storage and performance and incurs almost no performance overhead from introducing the temporal features.
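The idea of storing only changes and placing occasional anchor snapshots can be sketched for a single vertex as follows. This is an illustrative toy under stated assumptions, not TGDB's storage engine: the class `VersionedRecord`, the fixed `anchor_every` spacing, and replaying deltas forward from the nearest earlier anchor are all hypothetical choices, and property deletions are not modelled.

```python
import bisect


class VersionedRecord:
    """Property history of one vertex: deltas plus periodic anchors."""

    def __init__(self, anchor_every=3):
        self.anchor_every = anchor_every  # assumed fixed anchor spacing
        self.times = []     # commit timestamps, strictly ascending
        self.deltas = []    # deltas[i]: properties changed at times[i]
        self.anchors = {}   # version index -> full snapshot at that version
        self.current = {}   # latest state, kept apart from the history

    def update(self, ts, changes):
        if self.times and ts <= self.times[-1]:
            raise ValueError("timestamps must be strictly increasing")
        self.current = {**self.current, **changes}
        self.times.append(ts)
        self.deltas.append(dict(changes))  # each change is stored once
        idx = len(self.times) - 1
        if idx % self.anchor_every == 0:
            # Occasional full copy so reads never replay a long chain.
            self.anchors[idx] = dict(self.current)

    def as_of(self, ts):
        """Reconstruct the state visible at time ts."""
        idx = bisect.bisect_right(self.times, ts) - 1
        if idx < 0:
            return {}  # nothing had been committed yet
        start = idx - idx % self.anchor_every  # nearest anchor at or before idx
        state = dict(self.anchors[start])
        for i in range(start + 1, idx + 1):
            state.update(self.deltas[i])
        return state


record = VersionedRecord(anchor_every=2)
record.update(10, {"name": "a"})
record.update(20, {"age": 1})
record.update(30, {"name": "b"})
record.update(40, {"age": 2})
past = record.as_of(25)
```

With this layout a point-in-time read replays at most `anchor_every - 1` deltas, which is the trade-off the abstract describes: fewer anchors save storage, more anchors shorten version traversals.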

    Phosphorylation of NF-κB in Cancer

    The proinflammatory transcription factor nuclear factor-κB (NF-κB) has emerged as a central player in inflammatory responses and tumor development since its discovery three decades ago. In general, aberrant NF-κB activity plays a critical role in tumorigenesis and acquired resistance to chemotherapy. This aberrant activity frequently involves several post-translational modifications of NF-κB, including phosphorylation. In this chapter, we specifically cover the phosphorylation sites reported on the p65 subunit of NF-κB and their relationship to cancer. Importantly, phosphorylation is catalyzed by different kinases using adenosine triphosphate (ATP) as the phosphate donor. These kinases are frequently hyperactive in cancers and thus may serve as potential therapeutic targets for treating different cancers.